Here we list issues and features of the MX software available in the PReSTO installations at NSC Tetralith and LUNARC Cosmos that users of these installations should be aware of.
Some CryoSPARC job types, such as 3D Classification, do not meet the current GPU Usage Efficiency Policy decided at NSC Berzelius. This is frustrating because NSC Berzelius performs extremely well during 3D refinement and other more compute-intensive parts of CryoSPARC. PReSTO members from the Swedish Cryo-EM community could approach the CryoSPARC developers and ask them to deliver more efficient, parallel code for 3D Classification to avoid frustrating job terminations at this stage. Since CryoSPARC is proprietary code, we cannot simply make these code updates ourselves without permission from the vendor. A potential workaround is NVIDIA Multi-Instance GPU (MIG), which allows a single GPU to be partitioned into multiple smaller GPU instances - read more
Adding --reservation=safe to your SLURM command during 3D Classification enables you to avoid job termination. The safe reservation was recently introduced by NSC - more info.
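If the 3D Classification job is submitted through a batch script, the same flag can go into the script header (and, when CryoSPARC submits through a cluster lane, into the lane's SLURM script template). A minimal sketch, where the project name, GPU count and wall time are placeholders:
#SBATCH -A berzelius-202X-XXX      # placeholder project
#SBATCH --gpus=1                   # placeholder GPU count
#SBATCH -t 24:00:00                # placeholder wall time
#SBATCH --reservation=safe         # request the safe reservation to avoid termination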
The Cambridge Structural Database (CSD) is used by grade and grade2 from the BUSTER package to make ligand libraries for refinement. The CSD license system makes compute nodes non-eligible for CSD software packages, so use a login node when running grade/grade2 or CSD software such as Mogul/Mercury/GOLD. The CSD license needs to be refreshed from time to time, so if grade terminates with ModuleNotFoundError: No module named 'ccdc', open a terminal window and run ccdc_activator -a --activate -k 614884-8AE19D-4DA7A3-3E9F64-2182E7-8D208A and you are good to go for some time.
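For example, on a Tetralith login node the license refresh could look as follows; this is a sketch that assumes the BUSTER module is what provides grade/grade2 and ccdc_activator in PReSTO:
module load BUSTER
ccdc_activator -a --activate -k 614884-8AE19D-4DA7A3-3E9F64-2182E7-8D208A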
While buster-report depends on the CSD, buster-report only runs from the Tetralith login node. After a buster job has finished in a directory that we call buster3, a standard buster-report run would be:
module load BUSTER
buster-report -d buster3 -dreport buster3-report
For help and more options: buster-report -h
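For completeness, the buster3 directory above would typically come from an earlier BUSTER refinement run on a compute node; a hedged sketch, where the input files are placeholders and the refine options depend on your project:
module load BUSTER
refine -p model.pdb -m data.mtz -d buster3    # placeholder model and data files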
When supplying the CrystFEL GUI with a file list (of HDF5 data files), the GUI may fail to read the file list. It appears that the CrystFEL GUI expects an event list, where each event/image within the HDF5 files is individually referenced. Such event lists can be generated with the CrystFEL command-line utility "list_events", which should resolve the issue.
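A hedged sketch of generating such an event list, assuming the usual -i/-g/-o options of list_events and placeholder file names:
list_events -i files.lst -g detector.geom -o events.lst
Then load events.lst instead of files.lst in the CrystFEL GUI.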
XDSAPP3 is unable to handle spaces in the path to the .h5 master file during the "Load" step; however, it can handle spaces in the output directory path.
XDSAPP3 expects image files with _data in their filename. It would be better if XDSAPP3 looked into master.h5 to get the names of the image files, because the information is there. One cannot simply rename crystal_1_00000X.h5 to crystal_1_data_00000X.h5, because then the information in crystal_1_master.h5 becomes incorrect and Durin crashes. Instead, one can create symbolic links so that the XDSAPP3 internal parser gets _data in the filename while Durin stays happy.
In a terminal window, enter the directory with the data and run:
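# create *_data_* symlinks so the XDSAPP3 filename parser finds _data, while crystal_1_master.h5 and Durin still reference the original files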
for i in `seq 4`
do
    ln -s "crystal_1_00000$i.h5" "crystal_1_data_00000$i.h5"
done
After softlinking, the data can be processed normally!
COOT may quit working if you, for instance, disable "Show tips on startup" in the COOT preferences.
The way to recover is to remove the preferences directory: rm -rf /home/x_YourUser/.coot-preferences
COOT startup error message in ccp4i2 after changing Preferences in COOT GUI
Sometimes DIALS terminates with an out-of-memory statement like:
Processing sweep SWEEP1 failed: dials.integrate subprocess failed with exitcode 1: see /native/1600/dials/DEFAULT/NATIVE/SWEEP1/integrate/12_dials.integrate_INTEGRATE.log for more details
Error: no Integrater implementations assigned for scaling
Please send the contents of xia2.txt, xia2-error.txt and xia2-debug.txt to:
xia2.support@gmail.com
slurmstepd: error: Detected 2 oom-kill event(s) in step 37678.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
The DIALS software authors explain the issue/bug in detail, and the current workaround is to use half of the available cores on the compute nodes.
For instance, when using the BioMAX offline-fe1 cluster with native.script, change two lines of code into:
multiprocessing.nproc=20
sbatch -N1 --exclusive --ntasks-per-node=20 -J DIALS -o "$outdir/dials.out" --wrap="$dials"
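A similar halving should apply on the PReSTO clusters; for example on Tetralith, assuming its 32-core compute nodes, the corresponding (hypothetical) adjustment would be:
multiprocessing.nproc=16
sbatch -N1 --exclusive --ntasks-per-node=16 -J DIALS -o "$outdir/dials.out" --wrap="$dials"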
Queued Phenix jobs can use a single compute node only, due to limitations in the code that would need to be redesigned for Phenix to run on several nodes. A user can allocate more than a single compute node; however, only one node will be used for computing and the other nodes will simply be a waste of compute time. Eventually such an improper multi-node Phenix job will crash, but sometimes it will finish and waste significant amounts of compute time, which might go unnoticed by newcomers running Phenix. Jobs can be submitted to the compute nodes directly from the Phenix GUI running on the login node; just keep in mind to use only a single node, with 47 cores at Cosmos or 31 cores at Tetralith, and adapt your allocation times accordingly.
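When writing your own batch script for Phenix, the same single-node limit applies; a minimal sketch of the relevant SLURM header lines, with the project name and wall time as placeholders:
#SBATCH -A your-project-id     # placeholder project
#SBATCH -N 1                   # Phenix only uses one node; requesting more wastes compute time
#SBATCH --exclusive
#SBATCH -t 24:00:00            # placeholder wall time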
When running Phenix mr_rosetta on the BioMAX offline-fe1 cluster, the Log output window is blank during place model; however, the job is indeed running, as is more easily seen once it reaches the rosetta rebuild stage.
During place model a blank Log output window is shown because the logfile has not yet been created when the GUI reads it. This happens on the BioMAX offline cluster running an NFS filesystem in /home/DUOusername.
By using squeue -u DUOusername one can see that a job exists, and by ssh offline-cn1 followed by top -u DUOusername that a process is running, as shown below.
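In practice, the checks described above look like this (replace DUOusername with your own user name):
squeue -u DUOusername      # the mr_rosetta job is listed even though the Log window is blank
ssh offline-cn1            # log in to the compute node running the job
top -u DUOusername         # the mr_rosetta processes are running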
Here the job has reached the rosetta rebuild stage and many parallel processes are running in the top terminal window.
Phenix developers recommend using Phenix 1.19.2-4158 with Rosetta rather than 1.20.1-4487. We can confirm this: a rosetta.refine test job ran successfully with Phenix/1.19.2-4158-Rosetta-3.10-8-PReSTO but failed with Phenix/1.20.1-4487-Rosetta-3.10-5-PReSTO. The "path to Rosetta" setting in the Phenix GUI should be: /software/presto/software/Phenix/1.19.2-4158-foss-2019b-Rosetta-3.10-8/rosetta-3.10
Phenix has two wizards for molecular replacement using the PHASER software, named Phaser-MR (simple one-component interface) and Phaser-MR (full-featured). Neither of these two wizards sends molecular replacement jobs to the compute nodes, although Run-Submit queue job and Actions-Queuing system-submit jobs… are available, so this is a bug in Phenix version 1.20.1-4487. The related MRage or Molecular replacement pipeline (advanced) has the same bug when using Run-Submit queue job; however, its Actions-Queuing system-submit jobs… option does send molecular replacement jobs to the compute nodes as intended.
When saving movies in PyMOL 2.5.0 there are three encoders available, named mpeg_encode, convert and ffmpeg.
mpeg_encode and convert are working in the PReSTO installation:
- With mpeg_encode one can save MPEG1 movies for PowerPoint
- With convert one can save animated GIFs for web browsers like firefox
The ffmpeg option does NOT work in the PReSTO installation (yet). The ffmpeg build is missing the component required for the "MPEG 4" and Quicktime alternatives, so right now the names used in PyMOL are misleading: both alternatives try to create a video in the "H.264/MPEG-4 AVC" format. In the meantime there are online movie converters for the Quicktime format, for instance, and frames exported from PyMOL can be assembled manually as sketched below.
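As a workaround, the movie frames can be written as PNG images with PyMOL's mpng command and assembled outside PyMOL; a hedged sketch using ImageMagick convert, with the frame prefix, delay and output name as placeholders:
mpng frame_                                      # inside PyMOL: writes frame_0001.png, frame_0002.png, ...
convert -delay 5 -loop 0 frame_*.png movie.gif   # in a terminal: assemble an animated GIF with ImageMagick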